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Abstract 



Vortex methods are used to efficiently simulate incompressible flows using Lagrangian techniques. 
Use of the FMM (Fast Multipole Method) allows considerable speed up of both velocity evaluation 
and vorticity evolution terms in these methods. Both equations require field evaluation of constrained 
(divergence free) vector valued quantities (velocity, vorticity) and cross terms from these. These are 
usually evaluated by performing several FMM accelerated sums of scalar harmonic functions. 

We present a formulation of vortex methods based on the Lamb-Helmholtz decomposition of the 
velocity in terms of two scalar potentials. In its original form, this decomposition is not invariant with 
respect to translation, violating a key requirement for the FMM. One of the key contributions of this 
paper is a theory for translation for this representation. The translation theory is developed by introducing 
"conversion" operators, which enable the representation to be restored in an arbitrary reference frame. 
Using this form, efficient vortex element computations can be made, which need evaluation of just two 
scalar harmonic FMM sums for evaluating the velocity and vorticity evolution terms. Details of the 
decomposition, translation and conversion formulae, and sample numerical results are presented. 

1 Introduction 

Vortex methods are used to simulate the Navier Stokes equation in the velocity-vorticity form with La- 
grangian discretization. Since vortex particles are initially placed only in the region of finite vorticity and 
can convect along with the flow, these methods provide an optimized spatial discretization. Consider an 
incompressible flow generated by a set of N vortex elements, characterized by coordinates of the centers 
(sources) Xj and constant strength vector u)i, % = 1, N. Each element centered at location Xj produces an 
elementary velocity field Vj (y ) according to the Biot-Savart law, and the total velocity field can be computed 
as a superposition of such elementary fields (e.g., see El"): 



In practice this field needs to evaluated at M evaluation points, yj, which has 0(M N) cost. The Biot-Savart 
kernel is composed of a vector of dipole solutions of the Laplace equation. It is well known that the Fast 
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Multipole Method (FMM) can be used to evaluate such sums to any specified accuracy e at a 0(N + M) 
reduced cost JSJ. 

The vortex elements move with the flow. This motion also causes an evolution of the vortex field 
according to the vortex evolution equation. For inviscid flow, the evolution equations for the vortex positions 
and strengths respectively are 

IF = v| *=- ' ~df = s - (2) 

Si = u»i- Vv| x=Xi , v (xi;t) = v i ( x »; *) • 

Here the right hand side for the vortex strength is the so-called vortex stretching term, the evaluation of 
which requires the gradient of the velocity vector. The evolution equation for the vorticity can be modified 
to account for liquid viscosity. Also the elementary velocity field in Eq. ([T} can be modified using a 
smoothing kernel, K{\y — xj ; a), 

Vi (y;a) = UiX (y * l) K(\y - xj ; a), K(r; a) = 1 + O(e), r » a, (3) 

|y - x «l 

which has an effect only on the near field, r < a, where a is the radius of the vortex core, which may 
change in time, and e <C 1 is the tolerance for approximation. This modification does not affect the far field, 
which can also be computed with the tolerance O (e) in an accelerated manner via the FMM. In the sequel 
we do not specify the core function K(r; a); several choices including the Gaussian and polynomial forms 
are discussed in the literature (see e.g., [I9j EH 1211). and there are several ways to speed up the local 
summation as well. Extensions to compressible flow are also possible Q . 

The evolution equation is integrated using an appropriate time stepping scheme. The right hand side 
of this equation, also results in an N body computation for the influence of particles in the far-field, with 
somewhat more complicated terms. As discussed above, the FMM for the vortex element method is closely 
related to the scalar FMM for sums of multipoles of the Laplace equation ("harmonic FMM"). In fact, it 
is possible to start with a program for a harmonic FMM and appropriately modify it to create a fast vortex 
method. In terms of performance, computation of potential gradients and higher derivatives can be referred 
as "auxiliary" computations, which can be done as soon as local expansions for the potentials are available. 
In this sense the question as to how many independent potentials need to be computed to obtain right hand 
sides of the evolution equations ([2]) is important. For example, to compute the three components of the 
gravitational force in an stellar A r -body computation one needs only one harmonic FMM; the gradients 
can be obtained via differentiation of expansions, which is done efficiently by application of sparse matrix 
operators to the potential expansions. Treating the problem in a straightforward way, for the vortex method, 
one should use three independent harmonic FMM sums (for each velocity component). However, because 
of the divergence constraint, one may speculate that it is possible to reduce this number to two. THat this in 
fact is so is a main result of the present paper. 

A similar reduction of the complexity to solution of two harmonic FMMs was obtained for the bihar- 
monic equation |[T3l (opposed to five FMMs using factorization Q). For the Stokes equations, where the 
solution can be decomposed to the sum of Stokeslets and Stresslets, a representation via three harmonic 
potentials (Lamb-Helmholtz decomposition [20]) requires only three harmonic FMMs (see also ED ), while 
a more simple way based on factorization 11221 shows that the evaluation can be done with a cost of four 
harmonic FMM calls. 

In this paper we develop such an efficient version of the FMM for vortex methods, which achieves an 
evaluation of both the velocity and stretching term sums at a cost of only two scalar harmonic FMMs (this 
also can be reduced to one complex valued harmonic FMM since the physical fields are real). Our approach 
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is based on the Lamb-Helmholtz decomposition [20], which allows representation of the vector field in the 
form of two scalar potential fields. This form is however not invariant to translation, and cannot be used as is, 
with an FMM summation algorithm. We develop conversion operators that allow this form to be translated. 

Section 2 of the paper introduces the problem and notation, and shows that the equations can be consid- 
ered to be solutions of a divergence constrained vector Laplace equation. Section 3 develops the translation 
theory for such equations, which is the main mathematical result of the paper. Section 4 shows how the 
new translation theory can be used, together with a harmonic FMM, to create an FMM for vortex methods. 
Section 5 presents the results of numerical testing and some examples of FMM accelerated vortex element 
method computations. Section 6 concludes the paper. Mathematical details are provided in appendices. 



2 Statement of the problem 

We are given N vortex blobs of strength u>i, i = 1 , . . . , N located at points Xj and moving with the flow. 
The velocity field can be evaluated using either Eq. ([!]) or Q, which both have the same asymptotic far- 
field form. The evolution of the vortex positions and the vortex strengths is given by Eqs. ([2]). At y ^ x i; 
i = 1, N, the velocity field v (y) satisfies the divergence constrained vector Laplace equation (DCVLE) 

V 2 v = 0, V • v = 0. (4) 

If the divergence of the field were not constrained, each Cartesian component of the velocity would be an 
independent harmonic function. The divergence constraint, however, reduces the degree of freedom for 
solutions by 1, and, in fact, only two harmonic scalar potentials, <fi and x, are necessary to describe the total 
field inside or outside a sphere centered at the origin 

v(r) = V0(r) + Vx (r X (r)), V 2 = 0, V 2 X = 0. (5) 

This decomposition can be treated as a general Helmholtz decomposition of an arbitrary vector field. Pre- 
sumably, this form is due to Lamb EUl . who used it to obtain a general solution for the Stokes equations in 
spherical coordinates, and we refer to this as the Lamb-Helmholtz decomposition. Indeed, Eq. ([4]) are the 
Stokes equations with zero pressure for which the Lamb solution provides (|5]). 

The DCVLE appears naturally when one attempts to follow a general procedure to reconstruct an arbi- 
trary vector field from given curl, u (r), and divergence, q (r), 

V x v = u>, V • v = q. (6) 

The solution of these equations in free space can be written in the form (e.g. see El): 

v (y> " " Vy X ^A dv (x> + Vy x X STF^" (x) ' <7) 

Subdividing the space to the vicinity of evaluation point y (near field) and the domain outside this neighbor- 
hood (far field) and discretizing the integrals for the far field using quadratures with weights Wi and nodes 
Xj, we obtain for the far field contribution 



v /(y) = L v /iW> v / ,(y) = -V | +Vx- — (8) 

^ |y- x i| |y-xi 



Wiq (xj) _ WiU (xi) 

Hence, the far field satisfies Eq. ([4]) for which decomposition (|5]> can be used and just an addition to 
potential cj> due to a given monopole distribution q (x) provides solution for a general case. As mentioned, 
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in the present paper we do not address computation of the near field, which can be done locally, e.g. using 
appropriate smoothing kernels. Note that solution (|7]) of Eq. ([6]) is unique up to a gradient of a harmonic 
function <E>, which should be found from the boundary conditions. Such functions for a given boundary can 
be added to 4> (r) in Eq. 

Equations ([6]) with q ^ appear, e.g. in vortex methods for compressible flows. An example of equa- 
tions (for 2D) can be found in [5], which can be appropriately modified for 3D. In terms of computational 
complexity, besides the velocity field and stretching term computations, also contraction of the velocity gra- 
dient tensor, /3 = Vv : Vv should be computed in this case. This term can be computed simultaneously with 
computation of the vortex stretching term. Thus the cost in these extended cases should remain the same. 



3 Translation theory for DCVLE 

3.1 Basic translation and differential operators 

Translation operator: A generic translation or shift operator T(t), where t is a constant termed the trans- 
lation vector, acts on some scalar valued function (r) , to produce a new function 4> (r) (the translate), 
whose values coincide with <p (r) at shifted values of the argument 

^(r) = T(t)[0(r)]=Tt^ £(r) = 0(r + t), r,t GM 3 . (9) 

This operator is linear. Also the translates of harmonic functions are also harmonic functions. 

Elementary directional differential operators: We introduce the following notation for differential 
operators which appear in derivations: 

£> r = r-V, £> t = t-V, £> rx t = (r x t) • V. (10) 

It can be shown that if (r) is a harmonic function in some domain, then T> r <p , V t 4>, and V rxt 4> are also 
harmonic functions in the same domain. Note also that operators V t and V rxt are related to an infinitesimal 
translation in the direction of vector t and an infinitesimal rotation about axis t. 

3.2 Conversion operators for the DCVLE 

Consider the translation of the vector v (r) in Using (|9]) for the translated functions, 

v (r) = v (r + t) = V0 (r + t) + V x ((r + 1) X (r + t)) (11) 
= (r) + V X ((r + 1) x (r)) = (r) + V x (r* (r)) + V x (t* (r)) . 

Obviously this is not form (Ji} representing v (r). Our goal is to find harmonic functions (j> and x> which 
provide such a representation, i.e. 

v (r) = + V x (rx) ■ (12) 

For this purpose we introduce "conversion" operators Cy , i, j = 1,2 : 

4> = Cucj) + C12X, X = C2i<P + C 2 2X, (13) 
which are linear due to the linearity of all transforms considered. 



Comparing representations ( |ll| and ( 12 1 we deduce, that eft (r) contributes only to <p (r) , leading to 



C n = X, Cai = 0, (14) 
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where X is the identity operator. So, we can introduce harmonic functions 0' and x' according to the 
following relations 



= + Cux = 4> + 

X = C 2 2X = X + X'- 



(15) 



Having two representations of v, ( 1 1 1 and ( 12 1, and using Eq. ( 15 ), we obtain 



V0' + V x (rx') = V x (tx) • 
Taking scalar product with r and noticing that r • V x (rx') = 0, one can see that 

r • V0' = r • V x (tx) = r • (Vx x t) = - (r x t) • V*. 
Another relation can be obtained if we take the curl of expression ( [To] ): 

V x V x (rx') = V x V x (tx) • 

It is not difficult to check that the following identities hold for the harmonic functions x' an d X : 

V x V x (rx') = V (x' + r • Vx') , 
VxVx(tx) = V(t-Vx). 



(16) 



(17) 



(18) 



(19) 



Note that all scalar potentials are defined up to a constant. Therefore, we obtain from Eqs ( 18 1 and ( 19 1: 

X' + r • Vx' = t • V X . (20) 

Using ( 10 1 we can rewrite relations ( fT7| ) and (20 1 in the form 

D r 0' = -V TXt x, (X + V r ) X ' = T>tX- (21) 

In the next sections, we show that operators V r and (X + V r ) are invertible, and so we can write 

= $-v- x v TXt % 



x = x + {l + V r ) x v t % 
Comparing Eqs ( [T5| ) and ( [22] ), we obtain the following expressions for the conversion operators 

C12 (t) = -V^Vrxt, C22 (t) = X + (X + Vr)' 1 V t . 

3.3 Expansions of harmonic functions 

In addition to Cartesian coordinates we will use spherical coordinates (r, 9, </?): 



(22) 



(23) 



r = (x, y,z) = r (sin 6 cos (p, sin 6 sin <p, cos 1 



r Gil 



(24) 



For expansions of the solutions of the Laplace equation that are regular inside or outside a sphere centered at 
the origin of the reference frame we introduce the regular or local functions R™ (r) and singular or multipole 
functions S™ (r), that are respectively defined as 



iC(r) 



-l) n i |m| 



• n 4 m| (/^ 



mnip 



(n + \m\)\ 
S™ (r) = r ' m|( ^~ |m|)! 4 w| (M)e^ 



fi = cos V, 
, n = 0, 1, 



(25) 



m 



-n. 
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where are the associated Legendre functions, defined by Rodrigues' formula 

(-\) m (\ - u 2 ) m/2 rm+n 
P ™ = 1 lj \ l p. f— U 2 - l) n , m > 0. (26) 
n 2 n n! dfi m+n v ^ ; 

These functions are related to each other via 

S™ (r) = (-l) n+m ( n - m)!(n + m)!^ 2 ™- 1 ^ (r) . (27) 

The functions K£ (r) and (r) defined above coincide with the normalized basis functions I™ (r) and 
O™ (r) considered in (6J and normalized spherical basis functions in ifTBl . The expansion of the Green's 
function in this basis is 



oo n 



(28) 



n=0 m=—n 



Further, we represent harmonic functions in terms of sets of expansion coefficients over a certain basis 
centered at a given point, e.g. the local and multipole expansions centered at the origin are 



oo n oo n 

= E E CiC(r), x(r) = E E XnF?(r), F = R,S. 



(29) 



n=0 m=—n 



Absolute and uniform convergence of these series in the expansion regions is assumed below. We also 
extend the definition of the basis functions for arbitrary order m, to shorten some expressions 



iC (r) = S? (r) = 0, 



\m 



> n, n = 0,1, ... 



(30) 



3.4 Matrix representation of operators 

Let £ be a linear operator, such that for harmonic function 0, ip = C<p is also a harmonic function. Assume 



further that both (f> and ifi can be expanded into series of type (29 1. There should be a linear relation be- 
tween the expansion coefficients \1/ = {ip™} and $ = {</>™}, which, generally speaking, will have a form 
\I/ = L<&, where L is a matrix, or representation of C. Of course, for a given C the matrix L depends on the 
bases over which the expansion is taken. 

Let <fi (r) be expanded over basis {F™(r)}, while tp (r) be expanded over basis {G™(r)}. The action of 
the operator £ on a basis function F™(r) can be represented as 



n 



£iC(r) 



oo 

E 

n'=0 m'=—n' 



53 L%'?G$(t), n = Q,l,..., m 



-n, ...,n, 



(31) 



where L™,™ are the reexpansion coefficients. It can be shown that the entries of matrix L are L™!? , i.e. L 
is the matrix transpose of the matrix of reexpansion coefficients. Indeed, 



oo 

E 

n=0 m=—n 



it 



E lC<CM 



n'=0 m'=—n' 

oo n' oo n oo n 

E E c'E E ^™™'g™ (r) = e E 



n'=0 m'=—n' 



n=0 m=—n 



n=0 m=—n 



oo 

E 

.n'=0 m'=—n' 



Et mm! im! 
^nn' <?n' 



(32) 



G™(r) 



6 



The reexpansion coefficients for the translation operator 7t, Eq. (|9]), in the local and multipole bases (25 1 
can be simply expressed via the respective basis functions (see |6l and lfT3l ): 



TtiC(r) = E E (^|G)l m (t)G^'(r), F,G = S,R, (33) 

n'=0 m'=—n' 

(R\R)™;™(t) = C-^'(t), ( S\R)%'™ (t) = (t) , (S|S)^(t) = U^(t). 

Here (R\R)™'™, (S\R)™'™, and (515)™'™ are the entries of the local-to-local (L2L), multipole-to-local 
(M2L), and multipole-to-multipole (M2M) translation matrices, respectively. To obtain representations of 
other operators appeared above, we use differential relations for the basis functions, which also can be found 
in 13 and lfT3l : 

= -iC-i(r), 2VC (r) = S™ +1 (r) , (34) 
V x+iy R™{v) = ^ni'W, ^+%^ (r) = (r) , 



V x . iy R™(r) = iR™ll{v), V x - iy S™ (r) = iS%? (r) 



where 



9 9 9 

^ity = ~- ±1-=-, = 7T- (35) 
ax ay az 



Appendix A provides explicit matrix representation of operators ( pL0[ > and ( |23] > required for numerical im- 
plementation of the present method. 

4 Fast multipole method 

There is an extensive literature on the FMM for the 3D Laplace equation (see, e.g., lHl|3]|9l[T2j[T5|), and we 
just present the modifications necessary to use this harmonic FMM for vortex methods. 

Note that the FMM can be considered as a way to perform a dense matrix-vector product based on 
decomposition of the matrix into sparse and dense parts 

N 

v (yj) = ^ A ( y i ,x< ) Wi= E Myj,*i)ui+ E A (yj. x i)<^i (36) 

i=i x I en(y J ) x i ^n(y i ) 

N N 

8=1 1=1 

where xi, ...,xjv are the sources, yi, yju are the receivers, £1 (yj) is a neighborhood of a box containing 
yj which determines the sparse and dense parts of the matrix A. Local summation, or sparse matrix- 
vector multiplication, is performed directly, while the dense matrix-vector product is found via generation 
of multipole expansions, translations, and evaluations of the local expansions. The present paper is about 
an efficient way to perform the dense matrix- vector product and does not consider acceleration of the sparse 
matrix-vector multiplication. 

4. 1 Complexity of the FMM 

The relative cost of the different steps of the FMM depends on the source and receiver distributions, trunca- 
tion number p, and the depth of the octree, l msiK , which hierarchically partitions the computational domain 
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Table 1 : Comparison of the FMMs for the Laplace equation and for the VEM 



FMM for Laplace equation 


FMM for VEM 


Comments 


FMM set 


Create data structure 


The same 


octree, neighbor lists, etc. 


Precompute translations 


The same 


only for a single potential 


FMM run 


Create multipole expansions 


Modified for two potentials 


see Sec. 4.3 


M2M translations 


Modified for two potentials 


Sec. 4.4 


M2L translations 


Modified for two potentials 


Sec. 4.4 


L2L translations 


Modified for two potentials 


Sec. 4.4 


Evaluate expansions 


Modified for two potentials 


Sec. 4.5 


Local summation 


Add velocity and stretching 


direct evaluation of partial sums ( 1 1 & ( 2 1 



occupied by N sources and M receivers. Simple cost estimates for the dense and sparse matrix-vector 
products can be provided for random uniform distributions and M ~ N: 

C (dense) = N f A± + ^\ (.(sparse) = S = N . 8 ~W ; (37) 

where A\ is the sum of costs of generation and evaluation of single expansions, constant A2 is determined by 
the cost of translations per box, B\ is the constant determined by the complexity of direct local summation 
for a single receiver, and s is the number of sources at level Z max , which should be found from optimization. 
Note that qualitatively different C^ spaTse " > dependence on N and s holds for some computing architectures, 
e.g. for graphics processors (GPUs) ifTBI . It is also noticeable that cost NA\ can be neglected compared 
to other costs almost in all cases (and for simplicity we neglect it as well). In any case, theoretically, the 
optimum performance for a serial CPU implementation of the FMM can be achieved when the sum of the 
costs as a function of s reaches minimum, i.e. 

c (dense) ^ ^sparse) = R ^total) ^ ^ ( g g) 

Based on this we can find theoretical complexity ratios of different versions of the FMM. As a reference 
we use a harmonic FMM for a single potential, where the gradient of potential is also computed. The oper- 
ation count and our numerical experiments (see section below) show that in this case B\ is approximately 
the same for the scalar harmonic FMM and velocity+stretching computations. Hence, only A2 is changed 
in the case of the FMM for vortex methods. Eq. ( [38] > then shows that the cost increase for a standard three 
potential representation will be \/3 times, while for the two potential representation only y/2 times, or that 
the latter method is \/3/2 rs 1.22 times faster than the former method. Note then that in practice the depth 
of the octree can be changed only discretely, and a perfect balance of the sparse and dense parts of the algo- 
rithm cannot be achieved, so some fluctuations around the value v2 are expected. Analysis of efficiency of 
implementations with fine and coarse grained parallelism, such as |[T8l . also can be done, but this requires 
particular architecture considerations, which goes beyond the scope of this paper. 

4.2 Modification of complex valued harmonic FMM 

It is proposed in |[T3l to modify an available FMM routine for complex valued harmonic function to an 
FMM routine which provides the FMM for real valued biharmonic functions. So just one complex FMM 
can be executed instead of two FMMs for the real functions. Our tests show that such an approach provides a 
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small advantage compared to the FMMs for real harmonic functions. This method can be taken and applied 
directly to the present case, since a complex valued harmonic function ^ (r) can be composed from two real 
functions <fi (r) and X ( r )> as 

(r) = 4> (r) + i X (r) . (39) 

Further the translation algorithm for (r) will be exactly the same as for the biharmonic functions, de- 
scribed in fT3l , with the only difference in the conversion operators. 



4.3 Generation of multipole expansions 

Here we propose two methods to generate multipole expansions. The first method requires just a slight 
modification of a function generating the multipole expansion of a dipole source, which are available in 
many harmonic FMM codes. The second method utilizes a function generating the multipole expansion of 
a monopole source. 



4.3.1 Method 1 

Consider multipole expansion of v/ (y ) given by Eq. ([!]) about the center, x* ,of a source box b containing 
X£ and denote r = y — x t , 17 = x; — x*. The purpose is to find coefficients of scalar potentials and x^, 
which then should be summed up with respect to I, x/ G b to obtain coefficients and X(M n f° r tne box, 
which further should be used in the translation process. 

It is not difficult to show that the auxiliary harmonic functions 

^ = r-V0, = D r &, wi = Xl + r-V X i = (Z + D r )xj, (40) 
are dipoles, 

^ = r • V x = (r l xu l )-( r ~ Tl ) , (41) 



-ui ■ (r - ri) 



(42) 



r i\ 



i.e. tp l is a dipole with moment = r; x u>i, while w\ is a dipole with moment q; = —oji. Hence, 
coefficients ip™ t and w 7 ^ can be found using the dipole expansion procedure. We can determine coefficients 
0^ and Xfa, as the operators in Eq. (40 1 are diagonal in the 5 basis (see Eqs ( 62 ) and (64 1): 



n + 1 



Wlm Xlr. 



n 



m. 



{x% = o) . 



(43) 



4.3.2 Method 2 



Using Eq. (28 1, we obtain 



(y) = V x 



r i\ 



E 9Wn (r) , F^(r)=Vx [u»,5^(r)] 



m 
9ln 



i?" m (-r ; ). (44) 



n=0 m=—n 
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Note then that F]™ is equivalent to the right hand side of Eq. ( 16 1, where one should set t = u>/, x = S'n(r). 



(/ and x' can be found from Eqs (21 1 and (23 1 i.e. 



So this function can be represented in the form provided by the left hand side of Eq. ( 16 1, where functions 

(45) 



<C (r) 



V x [«,S^(r)] = (r) + V x (rX% (r)) , 
C 12 (w,) 5-(r), X£ (r) = (C 22 - X) S™(r) 



Substituting this into Eq. ( [44] ) and using representation of the conversion operators in the S basis, Eq. (75 I, 
we obtain 



v « (y) 

<t>i (r) 
X; (r) 



where 



Ain 



V0, (r) + V x (r Xl (r)) , 

oo n oo n 

£ £ CCi2(^)^(r) = E E C^(r) 



(46) 



n=0 m=— n 

oo n 



n=0 m=— n 

oo 



E E C(C 2 2(^)-^)^(r) = E E ^ 5 ™(r) 



n=0 m=— n 



n=0 m=—n 



1 



n + 1 
1 

n 



, . .n — m+1 i . .n + m+1 
(w fa - kj|„) 9jn - (u^ + iwjj,) gfc 



iu lz mg ln 
(X& = 0) • 



(47) 



4.4 Translations 

In many FMM codes translations are performed using the rotation-coaxial translation-back rotation (RCR) 
decomposition of the translation operators, which reduces translation cost of expansions of length p 2 to 
0(p 3 ), opposed to 0(p A ) required for the direct application of the translation matrix (e.g. see lTT3l ). Such 
a decomposition is also beneficial for faster conversion, since the rotations do not change the form of de- 
composition of the vector field ([5]) and there is no need to rotate <fi and x m conversion operators. Coaxial 
translation means translation along the z direction to distance t, in which case expressions for the conversion 
operators (f74b and d75|) become even simpler (t = ti z , t x = t y = 0, t z 



R conversion 
S conversion 



n 



At 



ty. 



v = y" 



t 



A.n ' ~An-l) 

n 



n + l 



v" 

Xn+li 



(48) 



X°o) 



Figure [T] illustrates the present translation scheme which uses the RCR-decomposition (rotation-coaxial 
translation-back rotation). 

4.5 Evaluation of local expansions 
4.5.1 Velocity 

As a result of the FMM downward pass the R expansions of scalar potentials are obtained about the center 
y* ,of an evaluation box b containing receiver point y 



<M r ) = E E CO), x» = E E xZKW, r = y-y* 



(49) 



n=0 m= 



n=0 m= 



10 



Rot Coax a Rot 1 a 
■) fa™ ¥n m Vn* 



b) 



y n 



Rot 



Rot 



Coax a 
¥/' 

Coax a 



Conv. 




Wn 



Rot" 1 _ 



Rot 



Figure 1: Translation schemes for the FMM for the scalar Laplace equation (a), and the FMM for the 
DCVLE (b), based on the RCR-decomposition. The operators are shown abbreviated as follows: Rot: 
rotation, Coax: coaxial translation, Rot -1 : back (inverse) rotation. 

Cartesian components of the velocity can be obtained by projection of Eq. (|5]> to the basis vectors i x , iy, and 
i 2 as follows 

Vk = ifc • v = i fc • V0 + i fc • V x (rx) = V- lk (/) + V rxik x, k = x,y,z. (50) 
since i(.-Vx (rx) = ifc • (V% x r) = (r x i fc ) • V%. Using representations of the above operators in the R 



basis, ( |67j ) and ( |73j ), where t = ifc, we determine 

oo n 

E E CCW, k = x,y,z, 

n=0 m=—n 

2 Nn+i 1 + + {n-m) x™ +1 - (n + m) x^] , 

* [C+i 1 - C+ + l + » (n - m) x™ + 1 + i (n + m) X™" 1 ] , 



(51) 



,771 



4.5.2 Stretching term and strain tensor 

Furthermore, consider computation of the vortex stretching at evaluation point y, (rj = yj—y*), assuming 
that the strength vector at this point is Uj. The stretching is a vector 



Sj = (uj • V) v (r)| = P u .v (r)| r=r . = E ^ V ^ v k ( r ) 



(52) 



Hence, the Cartesian components of this vector can be obtained simply from computed coefficients vjj™ , Eq. 



(51 



, to which sparse operator D^,/ should be applied (see Eq. (|67|)): 

oo n 

S i k = E E S Jkn^n( r j)^ k = X,y,Z, 



(53) 



n=0 m=—n 
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In practice, it is more efficient to compute expansion coefficients u[£ n for the functions uik = ^i^fe, I, k = 
1, 2, 3, which do not depend on the evaluation point j and then form the coefficients s™ kn for each point as 

3 

S ?kn = Y. UJ i lU %n- (54) 
1=1 

Note that uik are components of the tensor Vv. The contraction (3 = Vv :Vv can then be computed (for 
compressible flows). Also, computation of Vv provides the strain tensor, which can be used for modeling 
complex fluids. 



5 Numerical tests 

For numerical tests we used our FMM software which employs RCR-decomposition of translation operators, 
and modified it for two harmonic functions. Conversion operators in the R or S basis were executed after 
coaxial translation operators, as shown in Fig. [T] Additional small modifications were used in the algorithm 
to compute R basis functions for real harmonic functions recursively, as presented in [ 15j]. In contrast to ifTSI 
no optimizations of the algorithm were used (no GPU acceleration, standard 189 M2L translation stencils, 
no variable truncation number, etc.), as the main purpose of this paper was to provide a basic comparative 
performance and accuracy test of the method, with a basic implementation. Open MP parallelization was 
used, which for a 4 core PC provided parallelization efficiency close to 100 percent. Wall clock times 
reported below were measured on an Intel QX6780 (2.8 GHz) 4 core PC with 8GB RAM. 



5.1 Error tests 

The first test we conducted is related to the numerical errors of computation of the velocity and stretching 
term. Also for comparisons we executed the FMM for a single harmonic function and measured numerical 
errors in the evaluation of the potential and its gradient. There are two basic sources of errors. The first one 
is due to truncation of the infinite series. These errors are controlled by the truncation number p (the infinite 



series (29) were replaced by the first p 2 terms, n = 0, ...,p — 1; m = —n, n), which we varied in the 
tests. The second source of errors is due to the roundoff, which in our computations with double precision in 
the range of tested p were smaller than the truncation errors (the roundoff errors were observed for p > 25). 
The basic test was performed for N sources/receivers distributed randomly and uniformly inside a cube. 
The error, was measured in the L2 relative norm based on 1000 points randomly selected from the source 
set. (Our previous tests using direct computations show that even 100 points provide sufficient confidence 
for the L2-norm error, see [ 13].) For the reference solution the velocity field, stretching term, potential and 
gradient were computed directly. 

Figure [2] illustrates behavior of the computed errors for the velocity and stretching term. For reference, 
the dependence of the respective errors in the harmonic potential and in its gradient are also shown. It is seen 
that starting with p « 7 spectral convergence is observed for all cases. It is also noticeable, that the errors in 
potential computations are substantially smaller than that for the gradient or higher derivative computations. 
There are two basic reasons for this. First, the effective truncation number for each derivative is smaller by 
one compared to the potential, and second that in the truncated term for the derivative an additional factor 
~ p appears. 



5.2 Performance tests 

For optimal FMM performance the depth of the octree Z max should be chosen to minimize the total execution 
time. For all reported test cases we conducted such an optimization. Some results of the profiling (wall 
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Figure 2: Dependences of the relative FMM errors in the L2 norm on the truncation number, p. Errors 
were computed over 1000 random points for N = 2 20 sources of random intensity distributed uniformly 
randomly inside a cube. The maximum level of space subdivision Z max = 5. 

clock time in seconds) with random uniform distributions of sources inside a cube and on the surface of a 
sphere are provided in Tables 1 and 2. In these tables v and s indicate computations of the velocity and 
stretching term for vortical flows, while <fi and V0 refer to a reference case for the scalar Laplace equation, 
where the potential or both potential and its gradient should be computed. The total initialization time, 
which includes the data structure and precomputations related to translation operators can be amortized for 
a constant source/receiver set, and is reported separately from the total run time. As one can see this time is 
relatively small, while for dynamic problems it should be added to the total run time. 

The times for the local sum ("sparse") and far field ("dense") matrix-vector products in the FMM are 
also reported. The latter is also expanded to show timing of the FMM stages. The truncation number for 
all cases was p = 12, which provides errors £2 ~ 10 -5 for the velocity and stretching term computations, 
while smaller errors for <j) and V<p (see Fig. [2]). 



Table 2: Profiling of the FMM for random uniform distribution of sources inside a cube, p = 12. 



Case 


^max 


Total Init 


S -expansion 


Upward 


Downward 


R-evaluation 


Sparse MV 


Total Run 


iV = 2 19 


v and s 


4 


0.55 


0.55 


0.04 


2.65 


0.52 


25.9 


29.7 


v alone 


4 


0.55 


0.55 


0.04 


2.65 


0.34 


16.4 


20.0 


and V</> 


5 


1.20 


0.20 


0.12 


10.7 


0.36 


3.95 


15.3 


alone 


5 


1.20 


0.20 


0.12 


10.7 


0.21 


1.89 


13.1 


N = 2 2U 


v and s 


5 


1.71 


1.05 


0.27 


25.3 


1.11 


14.2 


41.9 


v alone 


5 


1.71 


1.05 


0.27 


25.3 


0.59 


9.04 


36.3 


(f> and V</> 


5 


1.71 


0.39 


0.12 


10.7 


0.71 


15.4 


27.3 


(f> alone 


5 


1.71 


0.39 


0.12 


10.7 


0.43 


7.32 


19.0 
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Table 3: Profiling of the FMM for random uniform distribution of sources on a sphere surface, p = 12. 



Case 


^max 


Total Init 


S -expansion 


Upward 


Downward 


R-evaluation Sparse MV 


Total Run 


iV = 2 iy 


v and s 


7 


1.23 


0.88 


0.34 


9.48 


0.66 


3.75 


15.1 


v alone 


6 


0.66 


0.63 


0.10 


2.46 


0.34 


8.66 


12.2 


and V</> 


7 


1.23 


0.23 


0.15 


3.93 


0.41 


3.90 


8.62 


4> alone 


7 


1.23 


0.23 


0.15 


3.93 


0.26 


1.86 


6.43 


N = 2 2U 


v and s 


7 


1.81 


1.29 


0.34 


9.48 


1.30 


14.1 


26.5 


v alone 


7 


1.81 


1.29 


0.34 


9.48 


0.70 


8.86 


20.7 


and V</> 


7 


1.81 


0.46 


0.15 


4.14 


0.84 


15.3 


20.9 


cj) alone 


7 


1.81 


0.46 


0.15 


4.14 


0.52 


7.30 


12.6 



The tables show that in the cases when the number of translations for a single potential <fi for the scalar 
Laplace equation and coupled potentials and \ f° r the DCVLE are the same (the same Z max ) the translation 
time for the latter case is approximately twice, as expected. Deviations may be explained by two factors. 
First, this is due to increase in the size of the arrays representing expansions and more time needed for data 
access, and, second, by the presence of the conversion operators. The tables also show that the time for 
sparse matrix- vector products for velocity only computations in DCVLE is slightly larger than for potential 
only computations in a harmonic FMM, while the time for the same operations for velocity and stretch- 
ing computations are slightly smaller than for potential and gradient computations. Note, however, that if 
an additional near-field kernel should be computed, which may involve computation of special functions 
(exponents, error integrals, etc.) the time for the sparse matrix-vector product would increase, while the 
translation part would not be affected. Also note that, theoretically, in the optimized algorithm, an increase 
of the complexity of the sparse matrix- vector product by a factor of k affects the total complexity as \fk (see 
Eq. p8|)). The ratio of the total time for the velocity and stretching computations to the time of potential and 
gradient computations depends on the problem. In all our numerical experiments this ratio never exceeded 
2 (except for one outlier at N = 1024, see Fig. [4]). 

Figure [3] illustrates dependence of the wall clock time on the number of sources N, which in all cases 
was set to be equal to the number of receivers. It is seen that at large N the algorithm scales linearly, and 
the time for velocity and stretching term computations is always larger than that for scalar potential only 
computations by a factor of approximately two. Figure [4] illustrates the wall clock FMM run time ratio of 
velocity and stretching to potential and gradient computations for different p and N = 2 k , k = 10, ...,20. It 
is seen that for k > 11 this ratio is larger than unity and smaller than two with a mean value about 1.405, 
while the average over all points shown in this figure is 1.423, which agrees well with theoretical value 
\/2 (see discussion after Eq. (38 1). Finally, we can see that the velocity and stretching computations are 
approximately twice as expensive as computation for a single harmonic potential. This also agrees well with 
the theory, since the cost of sparse matrix- vector product for a single harmonic potential approximately two 
times smaller than that for the velocity and stretching computations. Hence, in Eq. (38 1 both coefficients A2 
and B\ for the latter case are two times larger than that for the former case, which provides exactly factor 2. 



5.3 Example vortex computations 

We implemented vortex particle and vortex filament methods (VPM and VFM, respectively) accelerated by 
the FMM described above. Some results for test problems related to the vortex ring dynamics are presented 
below. 

In the VFM the total velocity field is a superposition of the Biot-Savart contour integrals taken along the 
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Figure 3: The wall clock time for the FMM for computation of vortical and potential flows (different terms 
and compinations). The straight line shows linear dependence. For all cases the sources are distributed 
randomly and uniformly inside a cube; the truncation number is constant, p = 12. 

vortex filaments. The field of a single filament, C, can be discretized as 



v(y) = ^ r (x)a(x)x i y - x) =f>(y). P5) 

i=l 



4?r Jc |y-x| 3 f-f 



_ IW *W*(y-*) t c = u£iCii 
47r -/Cj |y - x| J 

where is the number of elements and each element C; can be assumed a line segment of constant circu- 
lation Tj, in which case the integrals can be computed analytically (see Appendix B), 



Vl (y) = ^ ^ X ^ (— + —) r»=y-iW r W 

47T _(l) r (2) , (1) . (2) I (1) + (2) J ' y X * ' 



J = 1,2, (56) 



where , j = 1,2, are coordinates of the end points of Ci and condition Vj (y) = is imposed for y G Cj. 
Multipole expansion in terms of potentials and \ can be found using quadrature (see Appendix B). In the 
VFM vortex stretching occurs naturally since the ends of each segment propagate with the velocity of fluid 
particles at those points (e.g. see S3)- Hence in the FMM for the VFM the element centers are considered 
as "sources", while the end points are used as the "receivers". 

For numerical examples we used the following smoothing kernel 

T / 2 T ( \ 

K(r;a) = erf—-^--e, P ^-— 2 j, (57) 
which corresponds to the Gaussian vortex blob function with standard deviation a. 
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Figure 4: The ratio of the FMM run time for computation of the velocity and stretching term in a vortical flow 
to the respective time for the potential and gradient computations in a potential flow for different truncation 
numbers p and number of sources, N, which were randomly and uniformly distributed inside a cube. All 
data for N > 2048 are located between the dashed lines. 

5.3.1 Single vortex ring computations 

The self-induced motion of a vortex ring of radius R in an inviscid incompressible fluid is a classical solution 
lfT6l l20l . A characteristic of this problem is that the velocity of the ring itself should be infinitely large in 
this case, and the way to fix this is to introduce a small vortex core of size 5, in which case 

^~^( ta T-i)' (58 > 

The asymptotic behavior of the VFM scheme at large N can be estimated theoretically (see Appendix B). 

T 2 N 

K (self) ~ T^ ln — , N -> oo. (59) 
AttR it 

In Fig. [5] we compared the velocity of the ring obtained via VFM and VPM (average of all vortex 



element velocities) with theoretical prediction (59 1. The ring velocity found via the VFM using the brute 



force method (no FMM) is about 1 % different from the value provided by Eq. ( 59 ) and this difference 



decreases slowly with increasing N. So this error is not related to the use of the FMM, which at p = 12 



and p = 25 are several orders smaller, and can be referred to the discretization scheme. Despite Eq. (81 1 
provides consistency of the quadrature and FMM truncation errors it is sufficient to use N q = 2 and even 
N q = 1 at smaller p or very large N. This can be explained by the fact that the polygon approximation of 
the contour results in a globally non-smooth integrand. In this case high order quadratures do not improve 
accuracy, but the increase of the number of collocation points in a low order quadrature does. In the VPM 



we specified the characteristic vortex blob size in Eq. (57 1 as a = 2irR/N c . For N <C N c the vortex blobs 



do not overlap, while at N S> N c the overlapping is high and we have a ring of constant radius 5 ~ a, which 



does not depend on N and so the ring has a constant velocity (58 1. We also computed the velocity field 
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using N = 10 6 , N c = 5 • 10 5 for the VPM and N m 5 • 10 5 for the VFM, which provides approximately the 
same ring velocity. At p = 25 the relative error between the numerical and analytical solutions (Loo-norm) 
did not exceed 5 • 10~ 9 (eight digits) for the distances larger than 25, where 5 was found from Eq. (58 ). 




VFM, y=0 |V| VPM, y=0 l v l 




-1 -0.5 0.5 1 -1 -0.5 0.5 1 



Figure 5: The dependences of the vortex ring velocity and computational errors on the number of discretiza- 
tion elements N obtained by the VFM and VPM (the top raw) and the respective velocity fields for N = 10 6 
(VPM) and N = 490960 (VFM) (the bottom raw) for R = 1 and F = 2-it. The theoretical velocity is given 



by Eq. (59 1. The dashed lines for the ring velocity correspond to the VPM computations with vortex par- 
ticle size parameter a = 2ttR/N c , where N c is shown near the curves. The errors are plotted for different 
parameters values of p and N q controlling the accuracy of the FMM accelerated VFM. The discretization 
error is due to the approximation of the circle by linear elements and is not related to the FMM. 

The brute force computations of the ring dynamics using the VFM or VPM are stable, since all vortex 
elements are in symmetric positions. However, the FMM errors introduce noise and asymmetry, which 
result in much faster development of instabilities. So for ring dynamics computations at each time step we 
applied a O (N) stabilization procedure using a low-pass FFT-based filtering of the curve shape. 
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5.3.2 Pair of vortex rings 



Motion along the z-axis of two vortex rings of radii R\ (t) and R 2 (t) and positions Z\ (t) and Z 2 (t) can be 
found from the solution of the initial value problem for the system of four ODEs 

^ = V£>. ( Rj , Zj ; R 3 .„ Z 3 _i) , ^ = V} sdf) (Rj)+V^) (Rj, Z f ,R 3 ^, Z 3 _,) , j = 1, 2, (60) 

where vj r ^ and Vj are components of the velocity field Vj (r, z; Rj, Zj) of the jth ring in cylindrical 

j 



coordinates (r, z), and vj~ sel (Rj) is the self-induced velocity, Eq. (58l. We integrated Eq. (60l using the 



4th order Runge-Kutta method with controlled relative error 10 9 . 

Numerical tests were performed for two cases with the same initial conditions Ri (0) = #2(0) = 1, 
Zi (0) = 0, Z 2 (0) = 0.1 and ring core radii <5i = 5 2 = 5 = 9.966931 • 10~ 3 , but different circulations 
Ti = —T2 = 1 and Ti = F 2 = 1. The first case produces colliding rings, while the second case results 
in leapfrogging rings, shown in Figs [6] and [7] Vortex element simulations were performed using the VPM 
with N = 5 ■ 10 4 elements per ring and blob size a = 27r/N c , N c = 10 3 , which determines the ring core 
radii provided above during all computed times (N c <C N; for the colliding rings the initial radius increase 
was about 3 times at t = i max ~ 0.4). The time integration in the VPM was performed using a 4th order 
Runge-Kutta method with a constant time step At = 5 • 10~ 4 and an FFT-based low-pass shape filter (11 
Fourier modes retained for each ring). In all cases we used the FMM with p = 25. All errors were measured 
as the maximum relative errors, (L^ norm). For the case of colliding rings positional (max of Z s and 
R's) error = 1.5 • 10~ 3 for «; t < t max , but = 1.4 • 10~ 5 for t < t\ and = 1.2 • 10~ 6 
for ^ t ^ t 2 , where at t± the ring cores touch (Z 2 — Z\ = 25) and at t = t 2 we have Z 2 — Z\ = 45. 
For the case of leapfrogging rings positional error < 6.2 • 10~ 8 . This shows that the FMM velocity 
computations are accurate and a larger error in the first case, presumably, has a non-FMM related nature 
(due to the Gaussian spread of the vortex blobs). Errors in the velocity fields in Figs [6] and [7] at distances 
larger than 25 were approximately the same for both cases, 8.9 • 10~ 5 and 1.7 • 10~ 4 , respectively. 



6 Conclusion 

The main goal of this study was to develop an efficient method for fast summation of elementary vortices. 
Numerical tests confirm the validity of the theory presented and efficiency of the method. Our numerical 
results show that one should expect an increase of the computation time by a factor of approximately two for 
the velocity and vorticity stretching term computations compared to a single harmonic function computa- 
tions for the same FMM octree and the truncation number. Compared to potential and gradient computations 
for the scalar Laplace equation this increase varies in the range from 1 to 2 times (average wl.4) depending 
on particular source/receiver distribution, truncation number, etc. These numbers are in a good agreement 
with the theoretical FMM cost estimates. 

An interesting observation from the study is that a general reconstruction of vector fields from given 
curl and divergence can be obtained via the present method, which operates only with two scalar potentials. 
This may have application to many other fields of physics, including plasma physics, electromagnetism, etc. 
In this sense the DCVLE appears to be a fundamental equation, the solutions of which can be accelerated 
via the harmonic FMM. As is shown, modifications of standard FMM programs are relatively easy, and 
require tracking of two harmonic functions, and implementation of the conversion operators used in each 
translation. Such operators are very sparse and simple (especially for the case of the RCR-decomposition) 
and their execution does not create substantial overheads. In terms of further acceleration of computations 
it is natural to consider implementations of the method on graphics processors (GPUs) for which the vortex 
methods are developed and tested (e.g. [25]). 
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Figure 6: A comparison of dynamics of colliding rings (Ti = — = 1), obtained by the VPM accelerated 
by the FMM (p = 25) with analytical solution. Only ring 1 is shown in the upper left picture. The velocity 
fields near the ring cores are shown for t = 0.2. In the plot of analytical solution the velocity field is zeroed 
within the core radius 5 « 10~ 2 to show the radius. 

We also conducted some tests for the vortex filament and vortex particle methods using the algorithm 
developed. These tests show a good accuracy of computations compared to analytical solutions and indicate 
that the FMM errors are typically much smaller than the errors of the VFM and VPM schemes introduced by 
line integral discretization, vortex blob representation, time integration errors, and interpolation procedures. 
The FMM brings the entire VEM method to O(N) complexity for highly clustered non-uniform data dis- 
tributions (e.g. vortex lines and sheets). Such acceleration is important for numerous practical applications, 
e.g. for computation of vortical fields generated by helicopters ifTTl . while for such large scale simulations 
further study is required. 
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Figure 7: The same as in Fig. [6j but for leapfrogging rings (Ti = T2 = 1). In the upper raw of pictures the 
data for ring #1 are plotted by the thin lines and markers, while the data for ring #2 are plotted by the thick 
lines and markers. 
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A Matrix representation of differential and conversion operators 
A.l Operator V r 



To obtain the representation of this operator there is no need to consider differential relations, since accord- 

(61) 



ing to definition ( 10 1 we have V 

P r iC(r)=niC(r), V t £% (r 



r (d/dr) and from Eq. (25 1 we have 
-(n + l)5-(r 



n 



0,1,... 



(Ft) ( 

This shows that matrices Dr and DJ. representing this operator are diagonal and have entries 
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/ 1 



D< s > 



0,1,... 
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where 5 mm > is the Kronecker symbol. 

Conversion operators (23 1 contain inverse operators V^ 1 and (Z + V r )~ 1 , which also are diagonal. It 

( Ft) 

may be a cause for concern for the inverse operators that zeros appear on the diagonal of matrix Dj. and on 

(S) 

the diagonal of matrix I + Df. at n = 0. However, these are easily dispensed with. For the former case we 
note that harmonic n = corresponds to a constant basis function Rq (r) . Eq. (22 1 shows that this affects 
only the constant added to potential 4>, which obviously does not affect the velocity field (|5]>, and can be set 
to an arbitrary value, e.g. to zero. For the latter case, Eq. (Ill shows that harmonic n = in multipole 
expansion of function x also does not affect the velocity field, since 
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Since operator ^1 + D^'j is needed only to determine converted function \ ( see Eq. (22 1 this also 
can be set to zero. In other words, for the purpose of computation of the conversion operatorsme inverse 
operators for singular matrices can be defined as follows 
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A.2 Operator V t 

From definitions ( [T0| and ( [35] ) we have 



d d d 1 
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Using Eq. ( 34 1 we obtain 
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- [(t y + it x ) R™+1 - (ty - it x )R™_i] - MC-i> 
= \[{ty + it*)S™£-{t y -it x )S™£]-t z S™ +1 . 
Taking into account that matrices and are transposed to the reexpansion matrices, we obtain 
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A.3 Operator V rxt 



From definitions ( |T0| ) and ([35]) we have 
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Consider first action of this operator on basis functions R™ (r). The following relations derived in |[T3l are 
useful in this case: 

UK = ^ n + r H + 2 KX-^K +1 , (69) 
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We have then, using these relations and Eq. ( [34] ) 
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Now we obtain from Eq. (68 ) 
VrxtK = (*x + it 



^ w-m+l pm _j 
;,) ' 2 



t n L +m + l pm+1 
2 



iC " (** " Hy) ' ' ' ~ K + " " 
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To get a similar relation for basis functions S™ (r) we can use relation (27 1. Using identity (r x t) • 
^ [/ ( r ) 3 ( r )] = / ( r ) ( r x *) " ^3 ( r ) an d Eq. ( [7l] ), we obtain 
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^rxt^r = (-l) n+m (n-m)!(n + m)!r- 2 ' l - 1 P rxtJ R- 



Expressions for the representing matrices for the local and multipole bases follow from Eqs (71 1 and 
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A.4 Conversion operators 



It is not difficult to obtain matrix representations for the conversion operator from Eqs ( 14 1 and (23 1 and 
expressions for the differential operators derived above. A more compact form relating the expansion coef- 



ficients of functions in Eq. ( 22 1 for the R expansions is 
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Similarly, for the 5 expansions we have 
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B Some line integrals 

B.l Velocity field of linear element 



Consider velocity field, Eq. (55), of a linear vortex element of constant circulation T with end points 
and x^ 2 ) . This integral diverges for y G C. For y ^ C we change the integration variable as 



x = x( c ) + ^, xW = ^(x^+xW), l = x( 2 )-x« dl(x) = ilde, 
to obtain 
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The definite integral here can be computed using the primitive (can be checked by differentiation) 
d£ b 2 i + (a • b) 
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Further we note that all vectors in the resulting formula according to Eq. (76) can be expressed in terms of 
r-W = y — x^ 1 ) and r( 2 ) = y — x( 2 \ which results in expression (56 1. 



B.2 Far field expansions 



Far field expansion of the integrand for elementary velocity field in Eq. ( 77 ) can be obtained similarly to 
derivation of Eq. (|47|) from Eq. ([44]). Indeed, we have for expansion center x#, 
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(79) 



Hence, Eqs (45 1 and (46 ) can be used to expand potentials <fi and x into the harmonic series. Coefficients of 
these series are provided by Eq. (47 1, where one simply should use ^1 instead of vector ui and coefficients 
g™ instead of g 1 "^. The problem then is to compute in Eq. (79 1. These coefficients can be computed in a 



very straightforward way using Gauss-Legendre quadrature with weights Wj and abscissas £ • (see (JJ), 
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Note now that functions R n m (— + Jl^) are polynomials of degree n of ^ (see lfT3ll ). Hence, the Gauss- 



Legendre quadrature (80) provides an exact result for N q > n/2. Furthermore, for application of the FMM 
we truncate all series with p terms to provide a required accuracy. Therefore, the range of n needed is limited 
as n ^ p — 1, and choice 
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+ 1, 
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provides an exact result for all harmonics needed for application of the FMM. 



B.3 Velocity of the vortex ring without self-induction of small elements 

We exclude an e vicinity of a point y on the ring by putting its self-induction to zero. In this case we have 

r r di(x)x( y -x) 
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Now for a ring in z plane and y located on the x axis we can express Cartesian coordinates of vectors in Eq. 



( 82 1 as 
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where ip is the polar angle for the reference frame centered at the ring center. Substituting this into Eq. ( 82 1, 
we obtain 
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If the size of the line element in the VFM is /, then eR « 21 and the total number of elements is N = 2-kR/I. 
Hence 
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We also can determine the "effective" radius of the core for a given discretization (compare Eqs ( p8[ ) and 
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